
All Questions

1 vote
0 answers
23 views

Convolutional network for multilabel classification in NLP

I am trying to label code snippets, basing my approach on this article: https://arxiv.org/pdf/1906.01032.pdf My dataset is just code snippets (tokenized as ASCII characters) and 500 different labels from ...
pbartkow
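A minimal sketch of the kind of architecture this question describes: a character-level CNN with one sigmoid output per label, so each of the 500 labels is scored independently. The vocabulary size, layer sizes, and activations below are illustrative assumptions, not taken from the question or the paper.

```python
# Hypothetical sketch: character-level CNN for multilabel snippet tagging.
import tensorflow as tf
from tensorflow.keras import layers

NUM_LABELS = 500   # from the question
VOCAB_SIZE = 128   # ASCII character codes

model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, 64),
    layers.Conv1D(256, 7, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(256, activation="relu"),
    # Sigmoid (not softmax) so every label is an independent yes/no decision.
    layers.Dense(NUM_LABELS, activation="sigmoid"),
])
# Binary cross-entropy matches the per-label sigmoid setup.
model.compile(optimizer="adam", loss="binary_crossentropy")
```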
0 votes
2 answers
5k views

How to fine-tune GPT-J with small dataset

I have followed this guide as closely as possible: https://github.com/kingoflolz/mesh-transformer-jax. I'm trying to fine-tune GPT-J with a small dataset of ~500 lines: ...
Ilya Karnaukhov
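The linked repo fine-tunes with mesh-transformer-jax on TPUs; a commonly used alternative for a small dataset like this is the Hugging Face transformers Trainer, sketched below. The train.txt file name and all hyperparameters are assumptions, and note that full fine-tuning of the 6B-parameter model needs far more accelerator memory than a typical single GPU.

```python
# Hedged sketch: fine-tuning GPT-J with Hugging Face transformers instead of
# mesh-transformer-jax. File name and hyperparameters are assumptions.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
tokenizer.pad_token = tokenizer.eos_token   # GPT-J has no pad token by default
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

# ~500 lines of text, one training example per line (assumed file name).
dataset = load_dataset("text", data_files={"train": "train.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True,
                                 remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gptj-finetuned", num_train_epochs=3,
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8),
    train_dataset=tokenized,
    # mlm=False -> causal LM objective: labels are the shifted inputs.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```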
1 vote
1 answer
3k views

How is dropout applied to the embedding layer's output?

...
o_yeah
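For readers skimming: in Keras there are two common ways dropout is applied after an embedding layer, sketched below with assumed sizes. Plain Dropout zeroes individual activations, while SpatialDropout1D zeroes whole embedding channels across all timesteps, which is often preferred for embeddings.

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(None,), dtype="int32")
x = layers.Embedding(10000, 128)(inputs)   # -> (batch, time, 128)

# Option A: element-wise dropout on the embedding output.
a = layers.Dropout(0.5)(x)

# Option B: drop entire embedding dimensions for all timesteps at once.
b = layers.SpatialDropout1D(0.5)(x)
```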
0 votes
2 answers
4k views

How to train an LSTM with varying length input?

I have a dataset where the training instances differ in length, and the data is sequential. So I designed an LSTM, but I am wondering how to train it. In fixed-length data, ...
Swakshar Deb
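A minimal sketch of the standard answer, assuming a Keras setup: pad the sequences to a common length and let mask_zero=True make the LSTM ignore the padding (bucketing sequences by length is the usual alternative). All sizes below are toy values.

```python
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.sequence import pad_sequences

sequences = [[3, 7, 2], [4, 1], [9, 8, 6, 5]]      # toy variable-length data
padded = pad_sequences(sequences, padding="post")  # zero-pad to the max length

model = tf.keras.Sequential([
    # mask_zero=True: padded timesteps are masked out by downstream layers
    layers.Embedding(input_dim=10, output_dim=16, mask_zero=True),
    layers.LSTM(32),
    layers.Dense(1, activation="sigmoid"),
])
# `padded` now has shape (3, 4) and can be passed straight to model.fit.
```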
3 votes
1 answer
793 views

How large should the corpus be to optimally retrain the GPT-2 model?

I have just started working with the GPT-2 models and want to retrain one on a pretty narrow topic, so I am having trouble finding training material. How large should the corpus be to optimally retrain the GPT-...
Andreas Toresäter
1 vote
0 answers
131 views

Embedding Layer into Convolution Layer

I'm looking to encode PDF documents for deep learning such that an image representation of the PDF refers to word embeddings instead of graphic data. So I've indexed a relatively small vocabulary (88 ...
NewEndian
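A sketch of the idea, with assumed grid and layer sizes: a 2-D grid of word indices (the "image" of the page) goes through an Embedding layer, producing a (rows, cols, dim) tensor that Conv2D can treat as a multi-channel image. Only the 88-word vocabulary comes from the question; everything else is an assumption.

```python
import tensorflow as tf
from tensorflow.keras import layers

ROWS, COLS, EMB_DIM = 64, 64, 32   # assumed page grid and embedding size

inputs = tf.keras.Input(shape=(ROWS, COLS), dtype="int32")
x = layers.Embedding(input_dim=88, output_dim=EMB_DIM)(inputs)  # (64, 64, 32)
x = layers.Conv2D(64, 3, activation="relu")(x)  # embedding dims act as channels
x = layers.GlobalMaxPooling2D()(x)
outputs = layers.Dense(10, activation="softmax")(x)  # assumed 10 classes
model = tf.keras.Model(inputs, outputs)
```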
1 vote
0 answers
43 views

Low accuracy during training for text summarization

I am trying to implement an extractive text summarization model using Keras and TensorFlow. I have used BERT sentence embeddings, and the embeddings are fed into an LSTM layer and then to a Dense ...
inquisitive
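A sketch of the architecture the excerpt describes, with assumed dimensions: precomputed BERT sentence embeddings per document, an LSTM over the sentence sequence, and a per-sentence sigmoid marking whether each sentence belongs in the summary. With labels like these, accuracy is often dominated by the many negative sentences, which is worth checking before blaming the model.

```python
import tensorflow as tf
from tensorflow.keras import layers

MAX_SENTS, EMB_DIM = 50, 768   # 768 = BERT-base size; 50 sentences assumed

inputs = tf.keras.Input(shape=(MAX_SENTS, EMB_DIM))
x = layers.LSTM(128, return_sequences=True)(inputs)  # one state per sentence
outputs = layers.Dense(1, activation="sigmoid")(x)   # P(sentence in summary)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```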
2 votes
2 answers
4k views

Why is my loss (binary cross entropy) converging on ~0.6? (Task: Natural Language Inference)

I’m trying to debug my neural network (BERT fine-tuning) trained for natural language inference with binary classification of either entailment or contradiction. I've trained it for 80 epochs and its ...
Jack-P
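Useful context for this question: on balanced binary data, a model that always predicts p = 0.5 scores a binary cross-entropy of ln 2 ≈ 0.693, so a plateau near 0.6 means the model is only marginally better than chance.

```python
import math

# BCE of a constant p = 0.5 prediction on a 50/50 label mix.
p = 0.5
bce = -(0.5 * math.log(p) + 0.5 * math.log(1 - p))
print(bce)   # 0.6931... -- the "do-nothing" baseline to beat
```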
1 vote
0 answers
232 views

Simple sequential model with LSTM which doesn't converge

I'm trying to create a sequential neural network in order to translate a "human" sentence into a "machine" sentence understandable by an algorithm. Since that didn't work, I've tried to create a NN ...
user33665
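For orientation, a minimal encoder-decoder sketch in Keras, the usual starting point for this kind of sentence-to-sentence translation; all vocabulary and layer sizes below are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

SRC_VOCAB, TGT_VOCAB, LATENT = 100, 100, 256   # assumed sizes

# Encoder: read the source sentence, keep only the final LSTM states.
enc_in = tf.keras.Input(shape=(None,), dtype="int32")
enc_emb = layers.Embedding(SRC_VOCAB, 64)(enc_in)
_, state_h, state_c = layers.LSTM(LATENT, return_state=True)(enc_emb)

# Decoder: generate the target conditioned on the encoder's final states.
dec_in = tf.keras.Input(shape=(None,), dtype="int32")
dec_emb = layers.Embedding(TGT_VOCAB, 64)(dec_in)
dec_out, _, _ = layers.LSTM(LATENT, return_sequences=True,
                            return_state=True)(
    dec_emb, initial_state=[state_h, state_c])
outputs = layers.Dense(TGT_VOCAB, activation="softmax")(dec_out)

model = tf.keras.Model([enc_in, dec_in], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```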
3 votes
0 answers
497 views

How to use TPU for real-time low-latency inference?

I use Google's Cloud TPU hardware extensively with TensorFlow for training models and for inference; however, when I run inference I do it in large batches. The TPU takes about 3 minutes to warm up ...
adng
1 vote
2 answers
449 views

Why are embeddings important in NLP, and how do autoencoders work?

People say embeddings are necessary in NLP because using just word indices is inefficient: similar words are supposed to be related to each other, but indices don't capture that. However, I still don't truly get ...
Dan D
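A toy illustration of the point the question is circling: indices carry no notion of similarity, whereas embedding vectors can be compared directly. The 3-d vectors below are made up purely for illustration.

```python
import numpy as np

# As indices, "cat"=17 and "kitten"=4021: the numeric gap means nothing.
# As embeddings, related words can end up close together:
cat    = np.array([0.8, 0.1, 0.3])
kitten = np.array([0.7, 0.2, 0.3])
car    = np.array([-0.5, 0.9, 0.0])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(cat, kitten))  # high: similar words, similar vectors
print(cosine(cat, car))     # low: unrelated words stay apart
```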
0 votes
1 answer
58 views

How to change this RNN text classification code into text generation code?

I can do text classification with an RNN, in which the last output of the RNN (rnn_outputs[-1]) is multiplied by the output layer weights and a bias is added. That is, getting a word (class name) after the last T ...
Dan D
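A hedged sketch of the usual change (sizes assumed): for generation, the vocabulary-sized projection (matmul plus bias) is applied at every timestep rather than only to rnn_outputs[-1], and at inference time each predicted token is fed back in as the next input.

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB = 5000   # assumed vocabulary size

model = tf.keras.Sequential([
    layers.Embedding(VOCAB, 128),
    # return_sequences=True: one output (hence one softmax over the
    # vocabulary) per timestep, instead of the last output only.
    layers.LSTM(256, return_sequences=True),
    layers.Dense(VOCAB, activation="softmax"),
])
# Training pairs: inputs are tokens t_0..t_{n-1}, targets are t_1..t_n.
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```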
1 vote
1 answer
72 views

Do I need to use a pre-processed dataset to classify comments?

I want to use machine learning for text classification; more precisely, I want to determine whether a text (or comment) is positive or negative. I can download a dataset with 120 million comments. I ...
5 votes
1 answer
175 views

Which model should I use to determine the similarity between predefined sentences and new sentences?

The Levenshtein algorithm and some ratio-and-proportion scoring may handle this use case. Based on a predefined sequence of statements, such as "I have a dog", "I own a car", and many more, I must ...
alona
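A quick standard-library baseline for the Levenshtein-ratio idea, using difflib's edit-based similarity. It matches surface form only, so meaning-level matches usually call for sentence embeddings instead; the helper function below is hypothetical.

```python
from difflib import SequenceMatcher

templates = ["I have a dog", "I own a car"]

def best_match(sentence, candidates):
    # Return the template most similar to the new sentence, with its score.
    scored = [(t, SequenceMatcher(None, sentence.lower(), t.lower()).ratio())
              for t in candidates]
    return max(scored, key=lambda pair: pair[1])

print(best_match("I've got a dog", templates))  # picks "I have a dog"
```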
2 votes
0 answers
44 views

Sequence to sequence machine learning / NMT - converting numbers into words

I want to do some sequence-to-sequence modelling on source data that looks like this: /-0.013428/-0.124969/-0.13435/0.008087/-0.269241/-0.36849/ with target data ...
spaces_
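A small sketch of one way to turn the slash-delimited source string from the excerpt into a token sequence (treating each number as one token) before feeding a seq2seq model.

```python
src = "/-0.013428/-0.124969/-0.13435/0.008087/-0.269241/-0.36849/"

tokens = [tok for tok in src.split("/") if tok]   # drop empty edge pieces
values = [float(tok) for tok in tokens]
print(values)  # [-0.013428, -0.124969, -0.13435, 0.008087, -0.269241, -0.36849]
```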
